In time series forecasting, decomposition-based algorithms break aggregate data into meaningful components and are therefore valued for their advantages in interpretability. Recent algorithms often combine machine learning (hereafter ML) with decomposition to improve prediction accuracy. However, incorporating ML is generally considered to inevitably sacrifice interpretability. In addition, existing hybrid algorithms usually rely on theoretical models with statistical assumptions and focus only on the accuracy of aggregate predictions, and thus suffer from accuracy problems, especially in component estimates. In response to these issues, this research explores the possibility of improving accuracy without losing interpretability in time series forecasting. We first quantitatively define interpretability for data-driven forecasts and systematically review existing forecasting algorithms from the perspective of interpretability. Accordingly, we propose the W-R algorithm, a hybrid algorithm that combines decomposition and ML from a novel perspective. Specifically, the W-R algorithm replaces the standard additive combination function with a weighted variant and uses ML to modify the estimates of all components simultaneously. We mathematically analyze the theoretical basis of the algorithm and validate its performance through extensive numerical experiments. In general, the W-R algorithm outperforms all decomposition-based and ML benchmarks. Measured by P50_QL, the algorithm achieves relative accuracy improvements of 8.76% on practical sales forecasts from JD.com and 77.99% on a public electricity-load dataset. This research offers an innovative perspective for combining statistical and ML algorithms, and JD.com has implemented the W-R algorithm to make accurate sales predictions and guide its marketing activities.
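A minimal sketch of the weighted-recombination idea, assuming an STL decomposition and a ridge regression that learns the component weights; the decomposition, features, and train/test split below are illustrative choices rather than the paper's exact W-R implementation.

```python
# Sketch: decompose a series, then let an ML model adjust per-component weights
# instead of summing the components with fixed unit weights. All modeling choices
# here (STL, ridge, the split) are assumptions, not the W-R algorithm as published.
import numpy as np
from statsmodels.tsa.seasonal import STL
from sklearn.linear_model import Ridge

def weighted_recombination_forecast(y, period=7, train_frac=0.8):
    # 1) Classical decomposition: y ≈ trend + seasonal + residual.
    stl = STL(y, period=period).fit()
    comps = np.column_stack([stl.trend, stl.seasonal, stl.resid])

    # 2) Learn weights so that a weighted sum of components fits y better than
    #    the plain additive recombination (all weights fixed at 1).
    n_train = int(len(y) * train_frac)
    model = Ridge(alpha=1.0, fit_intercept=False)
    model.fit(comps[:n_train], y[:n_train])      # learned weights = model.coef_

    # 3) Recombine with the learned weights on the held-out part.
    y_hat = comps[n_train:] @ model.coef_
    return y_hat, model.coef_

# usage (synthetic weekly-seasonal series):
# y = 0.05 * np.arange(200) + np.sin(np.arange(200) * 2 * np.pi / 7)
# y_hat, weights = weighted_recombination_forecast(y)
```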
In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO extends YOLO with several new technologies, including Neural Architecture Search (NAS), an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. In particular, we use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone under the constraints of low latency and high performance, producing ResNet-like / CSP-like structures with spatial pyramid pooling and focus modules. In the design of necks and heads, we follow the rule of "large neck, small head". We import Generalized-FPN with accelerated queen-fusion to build the detector neck and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. Then we investigate how the detector head size affects detection performance and find that a heavy neck with only one task projection layer yields better results. In addition, we propose AlignedOTA to solve the misalignment problem in label assignment, and introduce a distillation schema to further improve performance. Based on these new technologies, we build a suite of models at various scales to meet the needs of different scenarios, i.e., DAMO-YOLO-Tiny/Small/Medium. They achieve 43.0/46.8/50.0 mAP on COCO with latencies of 2.78/3.83/5.62 ms on T4 GPUs, respectively. The code is available at https://github.com/tinyvision/damo-yolo.
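An illustrative sketch of the "large neck, small head" rule: the detection head is reduced to a single task-projection layer per branch on top of the (heavy) neck features. The channel counts, the anchor-free output layout, and all names here are assumptions for clarity, not DAMO-YOLO's actual head implementation.

```python
import torch
import torch.nn as nn

class LightDetHead(nn.Module):
    """A deliberately tiny head: one 1x1 projection per task and nothing else."""

    def __init__(self, in_channels=256, num_classes=80):
        super().__init__()
        self.cls_proj = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        self.reg_proj = nn.Conv2d(in_channels, 4, kernel_size=1)  # box offsets

    def forward(self, neck_feats):
        # neck_feats: list of multi-scale feature maps from a heavy neck
        # (e.g., a RepGFPN-style module); the head itself adds almost no compute.
        return [(self.cls_proj(f), self.reg_proj(f)) for f in neck_feats]

# usage:
# head = LightDetHead()
# cls_out, reg_out = head([torch.randn(1, 256, 40, 40)])[0]
# print(cls_out.shape, reg_out.shape)   # (1, 80, 40, 40) (1, 4, 40, 40)
```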
The effectiveness of knowledge graph embedding (KGE) largely depends on the ability to model intrinsic relation patterns and mapping properties. However, existing methods can only capture some of them with insufficient modeling capacity. In this work, we propose a more powerful KGE framework named HousE, which involves a novel parameterization based on two kinds of Householder transformations: (1) Householder rotations to achieve superior capacity for modeling relation patterns; (2) Householder projections to handle sophisticated relation mapping properties. Theoretically, HousE is capable of modeling crucial relation patterns and mapping properties simultaneously. Moreover, HousE is a generalization of existing rotation-based models while extending rotations to high-dimensional spaces. Empirically, HousE achieves new state-of-the-art performance on five benchmark datasets. Our code is available at https://github.com/anrep/house.
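A minimal sketch of relation-specific Householder rotations in a KGE scoring function: a rotation is composed from an even number of Householder reflections, each parameterized by a learnable vector. The distance-based score below follows the general spirit of rotation-based KGE models and is an assumption, not HousE's exact formulation (which also includes Householder projections).

```python
import torch

def householder_reflect(x, v):
    # Reflect x across the hyperplane orthogonal to v: x - 2 <x, v> / <v, v> * v.
    v = v / v.norm(dim=-1, keepdim=True)
    return x - 2.0 * (x * v).sum(-1, keepdim=True) * v

def householder_rotate(x, reflection_vectors):
    # An even number of reflections composes to an orthogonal rotation of x.
    for v in reflection_vectors:
        x = householder_reflect(x, v)
    return x

def score(head, relation_vectors, tail):
    # Smaller distance between rotated head and tail = more plausible triple.
    return -(householder_rotate(head, relation_vectors) - tail).norm(dim=-1)

# usage:
# h, t = torch.randn(2, 64), torch.randn(2, 64)   # a batch of two triples
# r = [torch.randn(64) for _ in range(4)]         # 4 reflections per relation
# print(score(h, r, t))
```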
In conventional object detection frameworks, a backbone inherited from image recognition models extracts deep latent features, and a neck module then fuses these features to capture information at different scales. Since the input resolution of object detection is much larger than that of image recognition, the backbone usually dominates the total inference cost. This heavy-backbone design paradigm is mostly a historical legacy of transferring image recognition models to object detection, rather than an end-to-end optimized design for detection itself. In this work, we show that this paradigm indeed leads to sub-optimal object detection models. To this end, we propose a novel heavy-neck paradigm, GiraffeDet, a giraffe-like network for efficient object detection. GiraffeDet uses an extremely lightweight backbone and a very deep and large neck module, which encourages dense information exchange among different spatial scales as well as among different levels of latent semantics simultaneously. This design paradigm allows the detector to process high-level semantic information and low-level spatial information with equal priority even at early stages of the network, making it more effective for detection tasks. Numerical evaluations on multiple popular object detection benchmarks show that GiraffeDet consistently outperforms previous SOTA models across a wide spectrum of resource constraints. The source code is available at https://github.com/jyqi/giraffedet.
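A conceptual sketch of the heavy-neck paradigm: an intentionally shallow, cheap backbone paired with a deep neck built from many repeated cross-scale fusion blocks, so most computation is spent on exchanging information between scales. The layer counts and fusion rule are assumptions for illustration, not GiraffeDet's actual lightweight backbone or GFPN.

```python
import torch
import torch.nn as nn

class LightBackbone(nn.Module):
    """A few strided convs: deliberately shallow and cheap."""

    def __init__(self, channels=64):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Conv2d(3 if i == 0 else channels, channels, 3, stride=2, padding=1)
            for i in range(4)
        )

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = torch.relu(stage(x))
            feats.append(x)
        return feats[-3:]                     # three scales for the neck

class DeepNeck(nn.Module):
    """Many repeated fusion blocks: this is where most of the compute lives."""

    def __init__(self, channels=64, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(depth)
        )

    def forward(self, feats):
        # Each block mixes every scale with its coarser neighbour (upsample + add),
        # so spatial and semantic information is exchanged densely and repeatedly.
        for block in self.blocks:
            for i in range(len(feats) - 2, -1, -1):
                up = nn.functional.interpolate(feats[i + 1], size=feats[i].shape[-2:])
                feats[i] = torch.relu(block(feats[i] + up))
        return feats

# usage:
# feats = LightBackbone()(torch.randn(1, 3, 256, 256))
# fused = DeepNeck()(feats)
```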
Quantization is one of the most effective methods for compressing neural networks and has achieved great success on convolutional neural networks (CNNs). Recently, vision transformers have demonstrated great potential in computer vision. However, previous post-training quantization methods perform poorly on vision transformers, leading to an accuracy drop of more than 1% even under 8-bit quantization. Therefore, we analyze the quantization problems of vision transformers. We observe that the distributions of activation values after the softmax and GELU functions are quite different from a Gaussian distribution. We also observe that common quantization metrics, such as MSE and cosine distance, are inaccurate for determining the optimal scaling factor. In this paper, we propose the twin uniform quantization method to reduce the quantization error on these activation values. We propose using a Hessian-guided metric to evaluate different scaling factors, which improves calibration accuracy at a small cost. To enable fast quantization of vision transformers, we develop an efficient framework, PTQ4ViT. Experiments show that the quantized vision transformers achieve near-lossless prediction accuracy (less than a 0.5% drop under 8-bit quantization) on the ImageNet classification task.
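A minimal sketch of the twin-uniform-quantization idea: activations after softmax (or GELU) are split into two ranges, each with its own uniform quantizer, instead of forcing one scale onto the whole highly non-Gaussian distribution. The threshold choice and bit allocation below are simplifying assumptions, not PTQ4ViT's exact scheme or its Hessian-guided calibration.

```python
import numpy as np

def twin_uniform_quantize(x, threshold, bits=8):
    levels = 2 ** (bits - 1) - 1     # one bit reserved to flag which range was used
    small = np.abs(x) <= threshold

    scale_small = threshold / levels             # fine scale for values near zero
    scale_large = np.abs(x).max() / levels       # coarse scale for the long tail

    return np.where(small,
                    np.round(x / scale_small) * scale_small,
                    np.round(x / scale_large) * scale_large)

# usage: post-softmax attention rows concentrate near zero, so a small threshold
# keeps them precise while the second range still covers the few large entries.
x = np.random.dirichlet(np.ones(197), size=4)    # fake attention distributions
x_q = twin_uniform_quantize(x, threshold=0.05)
print(np.abs(x - x_q).mean())
```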
User-item interactions in recommendation can naturally be modeled as a user-item bipartite graph. Given the success of graph neural networks (GNNs) in graph representation learning, GNN-based collaborative filtering (CF) methods have been proposed to advance recommender systems. These methods typically make recommendations based on learned user and item embeddings. However, we found that they do not perform well on sparse user-item graphs, which are quite common in real-world recommendation. Therefore, in this work, we introduce a novel perspective for building GNN-based CF methods and propose the framework Localized Graph Collaborative Filtering (LGCF). A key advantage of LGCF is that it does not need to learn an embedding for each user and item, which is challenging in sparse scenarios. Instead, LGCF aims to encode useful CF information into localized graphs and make recommendations based on these graphs. Extensive experiments on various datasets validate the effectiveness of LGCF, especially in sparse scenarios. Furthermore, empirical results show that LGCF provides complementary information to embedding-based CF models, which can be used to improve recommendation performance.
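A minimal sketch of the localized-graph idea: rather than a global embedding per user and item, score a candidate (user, item) pair from the enclosing subgraph around that pair. Subgraph extraction is shown with networkx; the GNN that scores the subgraph is left out. The hop count and the removal of the target edge are illustrative assumptions, not LGCF's exact construction.

```python
import networkx as nx

def enclosing_subgraph(bipartite_graph, user, item, hops=2):
    # Collect nodes within `hops` of either endpoint of the candidate pair.
    nodes = {user, item}
    for seed in (user, item):
        reachable = nx.single_source_shortest_path_length(bipartite_graph, seed, cutoff=hops)
        nodes.update(reachable)
    sub = bipartite_graph.subgraph(nodes).copy()

    # Remove the target edge (if present) so a model cannot simply read off the label.
    if sub.has_edge(user, item):
        sub.remove_edge(user, item)
    return sub

# usage:
# g = nx.Graph()
# g.add_edges_from([("u1", "i1"), ("u2", "i1"), ("u2", "i2")])
# print(enclosing_subgraph(g, "u1", "i2").nodes)
```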
Recently, the self-supervised pre-training paradigm has shown great potential in leveraging large-scale unlabeled data to improve downstream task performance. However, increasing the scale of unlabeled pre-training data in real-world scenarios requires prohibitive computational costs and faces the challenge of uncurated samples. To address these issues, we build a task-specific self-supervised pre-training framework from a data-selection perspective, based on a simple hypothesis that pre-training on unlabeled samples whose distribution is similar to the target task can bring substantial performance gains. Building on this hypothesis, we propose a novel framework for Scalable and Efficient visual Pre-Training (SEPT) by introducing a retrieval pipeline for data selection. SEPT first leverages a self-supervised pre-trained model to extract features of the entire unlabeled dataset to initialize the retrieval pipeline. Then, for a specific target task, SEPT retrieves the most similar samples from the unlabeled dataset for each target instance based on feature similarity. Finally, SEPT pre-trains the target model on the selected unlabeled samples in a self-supervised manner and fine-tunes it on the target data. By decoupling the scale of pre-training from the available upstream data for a target task, SEPT achieves high scalability of the upstream dataset and high efficiency of pre-training, resulting in high model-architecture flexibility. Results on various downstream tasks demonstrate that SEPT can achieve competitive or even better performance compared with ImageNet pre-training while reducing the number of training samples by one order of magnitude without resorting to any extra annotations.
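A minimal sketch of the retrieval step: embed the unlabeled pool once, then, for each target-task instance, pick its nearest neighbours by cosine similarity and pre-train only on the union of the retrieved samples. The feature extractor, brute-force search, and k are placeholder assumptions rather than SEPT's exact pipeline.

```python
import numpy as np

def retrieve_pretraining_subset(pool_feats, target_feats, k=50):
    # L2-normalize so a dot product equals cosine similarity.
    pool = pool_feats / np.linalg.norm(pool_feats, axis=1, keepdims=True)
    target = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)

    selected = set()
    for t in target:
        sims = pool @ t                          # (num_pool,) cosine similarities
        selected.update(np.argsort(-sims)[:k])   # indices of the top-k most similar
    return sorted(selected)

# usage:
# pool_feats   = np.random.randn(100_000, 512)   # features of the unlabeled pool
# target_feats = np.random.randn(1_000, 512)     # features of the target task
# subset = retrieve_pretraining_subset(pool_feats, target_feats, k=50)
```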
Multi-agent collaborative perception can significantly upgrade perception performance by enabling agents to share complementary information with each other through communication. It inevitably results in a fundamental trade-off between perception performance and communication bandwidth. To tackle this bottleneck issue, we propose a spatial confidence map, which reflects the spatial heterogeneity of perceptual information. It empowers agents to share only spatially sparse yet perceptually critical information, contributing to where to communicate. Based on this novel spatial confidence map, we propose Where2comm, a communication-efficient collaborative perception framework. Where2comm has two distinct advantages: i) it considers pragmatic compression and uses less communication to achieve higher perception performance by focusing on perceptually critical areas; and ii) it can handle varying communication bandwidth by dynamically adjusting the spatial areas involved in communication. To evaluate Where2comm, we consider 3D object detection in both real-world and simulation scenarios with two modalities (camera/LiDAR) and two agent types (cars/drones): OPV2V, V2X-Sim, DAIR-V2X, and our original CoPerception-UAVs. Where2comm consistently outperforms previous methods; for example, it achieves more than 100,000× lower communication volume and still outperforms DiscoNet and V2X-ViT on OPV2V. Our code is available at https://github.com/mediabrain-sjtu/where2comm.
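A minimal sketch of confidence-guided sparse sharing: each agent keeps only the spatial cells with the highest confidence under a bandwidth budget and transmits just those features together with their coordinates. The confidence source, the top-k budget policy, and the BEV layout are illustrative assumptions, not Where2comm's exact formulation.

```python
import torch

def select_messages(feature_map, confidence_map, budget_ratio=0.05):
    # feature_map: (C, H, W) BEV features; confidence_map: (H, W) values in [0, 1].
    c, h, w = feature_map.shape
    k = max(1, int(budget_ratio * h * w))            # bandwidth budget in cells

    keep = torch.topk(confidence_map.flatten(), k).indices   # perception-critical cells
    ys, xs = keep // w, keep % w

    sparse_feats = feature_map[:, ys, xs]            # (C, k) features to transmit
    return sparse_feats, torch.stack([ys, xs], dim=1)

# usage:
# feats, coords = select_messages(torch.randn(64, 100, 100), torch.rand(100, 100))
# print(feats.shape, coords.shape)                   # (64, 500) and (500, 2)
```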
As one of the most successful AI-powered applications, recommender systems aim to help people make appropriate decisions in an effective and efficient way by providing personalized suggestions in many aspects of our lives, especially in various human-oriented online services such as e-commerce platforms and social media sites. Over the past few decades, the rapid development of recommender systems has greatly benefited humanity by creating economic value, saving time and effort, and promoting social good. However, recent studies have found that data-driven recommender systems can pose serious threats to users and society, such as spreading fake news to manipulate public opinion on social media sites, amplifying unfairness toward under-represented groups or individuals in job-matching services, or inferring private information from recommendation results. Therefore, the trustworthiness of recommender systems has attracted attention from many perspectives as a way to mitigate their negative impacts and enhance the public's trust in recommender system techniques. In this survey, we provide a comprehensive overview of trustworthy recommender systems (TRec), with a particular focus on six of the most important aspects: safety and robustness, non-discrimination and fairness, explainability, privacy, environmental well-being, and accountability and auditability. For each aspect, we summarize recent related technologies and discuss potential research directions to help achieve trustworthy recommender systems in the future.
Online Gaussian processes (GPs), typically used to learn models from time-series data, are more flexible and robust than offline GPs. Both local and sparse approximations of GPs can efficiently learn complex models online. However, these methods assume that all signals are relatively accurate and that all data can be learned from without misleading samples. Moreover, in practice, the online learning capability of GPs is limited in high-dimensional problems and long-term tasks. This paper proposes a sparse online GP (SOGP) with a forgetting mechanism that forgets distant model information at a specific rate. The proposed approach combines two conventional data-deletion schemes for the basis vector set of the SOGP: a scheme based on position information and a scheme based on the oldest point. We apply our approach to learning the inverse dynamics of a collaborative robot with 7 degrees of freedom in a two-segment trajectory-tracking problem with task switching. Both simulations and experiments show that the proposed approach achieves better tracking accuracy and prediction smoothness compared with the two conventional data-deletion schemes.
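A minimal sketch of a budgeted basis-vector set for an online GP: when the set is full, one stored point is dropped, either the oldest one or the one farthest from the current input (a crude stand-in for the position-based scheme). The GP posterior update itself is omitted, and the deletion policies are illustrative assumptions rather than the paper's forgetting mechanism.

```python
import numpy as np
from collections import deque

class BasisVectorSet:
    def __init__(self, budget=50, scheme="oldest"):
        self.budget = budget
        self.scheme = scheme          # "oldest" or "position"
        self.points = deque()         # each entry: (x, y)

    def add(self, x, y):
        if len(self.points) >= self.budget:
            if self.scheme == "oldest":
                self.points.popleft()                      # drop the oldest point
            else:
                # Drop the basis point farthest from the current operating point.
                dists = [np.linalg.norm(px - x) for px, _ in self.points]
                del self.points[int(np.argmax(dists))]
        self.points.append((x, y))

# usage:
# bv = BasisVectorSet(budget=3, scheme="position")
# for t in range(10):
#     bv.add(np.array([float(t), np.sin(t)]), np.cos(t))
# print(len(bv.points))   # 3
```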